8 research outputs found

Real-time operating system support for multicore applications

Author: Gracioli Giovani
Publication date: 01/01/2014

Doctoral thesis - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia de Automação e Sistemas, Florianópolis, 2014.
Abstract: Modern multicore platforms feature multiple levels of cache memory placed between the processor and main memory to hide the latency of the memory hierarchy. The primary goal of this cache hierarchy is to improve average execution time, at the cost of predictability. The uncontrolled use of the cache hierarchy by real-time tasks may impact the estimation of their worst-case execution times (WCET), especially when real-time tasks access a shared cache level, causing contention for shared cache lines and increasing application execution time. This contention in the shared cache may lead to missed deadlines, which is intolerable, particularly in hard real-time (HRT) systems.
Shared cache partitioning is a well-known technique used in multicore real-time systems to isolate task workloads and to improve system predictability. Presently, state-of-the-art studies that evaluate shared cache partitioning on multicore processors fall short in two key respects. First, the cache partitioning mechanism is typically implemented either in a simulated environment or in a general-purpose OS (GPOS), so the impact of kernel activities, such as interrupt handling and context switching, on the task partitions tends to be overlooked. Second, the evaluation is typically restricted to either a global or a partitioned scheduler, thereby failing to compare the performance of cache partitioning when tasks are scheduled by different schedulers. Furthermore, recent works have confirmed that OS implementation aspects, such as the choice of scheduling data structures and interrupt handling mechanisms, impact real-time schedulability as much as scheduling-theoretic aspects. However, these studies also used GPOSes patched with real-time extensions, which affects the run-time overhead observed in these works and consequently the schedulability of real-time tasks. Additionally, current multicore scheduling algorithms do not consider scenarios where real-time tasks access the same cache lines due to true or false sharing, which also impacts the WCET.
This thesis addresses these problems with cache partitioning techniques and multicore real-time scheduling algorithms as follows. First, real-time multicore support is designed and implemented on top of an embedded operating system designed from scratch. This support consists of several multicore real-time scheduling algorithms, such as global and partitioned EDF, and a cache partitioning mechanism based on page coloring. Second, a comparison is presented in terms of schedulability ratio, considering the run-time overhead of the implemented RTOS and of a GPOS patched with real-time extensions. In some cases, Global-EDF considering the overhead of the RTOS achieves a better schedulability ratio than Partitioned-EDF considering the overhead of the patched GPOS, which clearly shows how different OSs impact hard real-time schedulers. Third, an evaluation of the impact of cache partitioning on partitioned, clustered, and global real-time schedulers is performed. The results indicate that a lightweight RTOS does not compromise real-time guarantees, and that shared cache partitioning behaves differently depending on the scheduler and the tasks' working set size. Fourth, a task partitioning algorithm that assigns tasks to cores respecting their usage of cache partitions is proposed. The results show that by simply assigning tasks that share cache partitions to the same processor, it is possible to reduce the contention for shared cache lines and to provide HRT guarantees. Finally, a two-phase multicore scheduler that provides HRT and soft real-time (SRT) guarantees is proposed. It is shown that by using information from hardware performance counters at run-time, the RTOS can detect when best-effort tasks interfere with real-time tasks in the shared cache. The RTOS can then prevent best-effort tasks from interfering with real-time tasks. The results also show that the assignment of exclusive partitions to HRT tasks, together with the two-phase multicore scheduler, provides HRT and SRT guarantees, even when best-effort tasks share partitions with real-time tasks.
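
The fourth contribution in the abstract above (placing tasks that share cache partitions on the same processor) can be illustrated with a small sketch. This is not the thesis's algorithm: the task format, the function names, the union-find grouping, the first-fit-decreasing placement, and the per-core utilization bound of 1.0 (the classic EDF bound for implicit-deadline tasks) are all assumptions made for illustration.

```python
from collections import defaultdict


def group_by_shared_partitions(tasks):
    """Group task names that (transitively) share at least one cache partition.

    tasks: dict mapping name -> (utilization, set of partition ids).
    This input format is hypothetical, chosen only for the sketch.
    """
    parent = {name: name for name in tasks}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    owner = {}  # partition id -> first task seen using it
    for name, (_, partitions) in tasks.items():
        for p in partitions:
            if p in owner:
                parent[find(name)] = find(owner[p])  # merge the two groups
            else:
                owner[p] = name

    groups = defaultdict(list)
    for name in tasks:
        groups[find(name)].append(name)
    return sorted(sorted(g) for g in groups.values())


def assign_groups_to_cores(tasks, n_cores, bound=1.0):
    """Place whole groups on cores, first-fit in decreasing utilization order.

    Returns a dict core index -> list of task names, or None when some
    group does not fit on any core under the assumed utilization bound.
    """
    groups = group_by_shared_partitions(tasks)
    groups.sort(key=lambda g: sum(tasks[t][0] for t in g), reverse=True)
    load = [0.0] * n_cores
    placement = {core: [] for core in range(n_cores)}
    for group in groups:
        u = sum(tasks[t][0] for t in group)
        for core in range(n_cores):
            if load[core] + u <= bound + 1e-9:  # tolerance for float rounding
                load[core] += u
                placement[core].extend(group)
                break
        else:
            return None  # no core can host this group
    return placement


if __name__ == "__main__":
    example = {
        "t1": (0.3, {0, 1}),
        "t2": (0.2, {1}),  # shares partition 1 with t1
        "t3": (0.4, {2}),
        "t4": (0.5, {3}),
    }
    print(assign_groups_to_cores(example, n_cores=2))
```

Because groups are placed as a unit, tasks that share a partition never end up on different cores, so they cannot contend for the same cache lines in parallel. The price is that a large group may not fit on any single core even when the total utilization would.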

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositório Institucional da UFSC

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Fixed-Priority Memory-Centric Scheduler for COTS-Based Multiprocessors

Author: Bertogna Marko
Caccamo Marco
Gracioli Giovani
Kloda Tomasz
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 32nd Euromicro Conference on Real-Time Systems (ECRTS 2020)
Publication date: 01/01/2020

Memory-centric scheduling attempts to guarantee temporal predictability on commercial-off-the-shelf (COTS) multiprocessor systems to exploit their high performance for real-time applications. Several solutions proposed in the real-time literature have hardware requirements that are not easily satisfied by modern COTS platforms, like hardware support for strict memory partitioning or the presence of scratchpads. However, even without such hardware support, it is possible to design an efficient memory-centric scheduler. In this article, we design, implement, and analyze a memory-centric scheduler for deterministic memory management on COTS multiprocessor platforms without any hardware support. Our approach uses fixed-priority scheduling and proposes a global "memory preemption" scheme to boost real-time schedulability. The proposed scheduling protocol is implemented in the Jailhouse hypervisor and Erika real-time kernel. Measurements of the scheduler overhead demonstrate the applicability of the proposed approach, and schedulability experiments show a 20% gain in schedulability when compared to contention-based and static fair-share approaches.

Dagstuhl Research Online Publication Server

A Survey on Cache Management Mechanisms for Real-Time Embedded Systems

Author: Alhammad Ahmed
Fröhlich Antônio
Gracioli Giovani
Mancuso Renato
Pellizzoni Rodolfo
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/11/2015

© ACM, 2015. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Computing Surveys, 48, 2 (November 2015), http://doi.acm.org/10.1145/2830555
Multicore processors are being extensively used by real-time systems, mainly because of their demand for increased computing power. However, multicore processors have shared resources that affect the predictability of real-time systems, which is key to correctly estimating the worst-case execution time of tasks. One of the main sources of unpredictability in a multicore processor is the cache memory hierarchy. Recently, many research works have proposed different techniques to deal with caches in multicore processors in the context of real-time systems. Nevertheless, a review and categorization of these techniques is still an open topic and would be very useful for the real-time community. In this article, we present a survey of cache management techniques for real-time embedded systems, from the first studies of the field in 1990 up to the latest research published in 2014. We categorize the main research works and provide a detailed comparison in terms of similarities and differences. We also identify key challenges and discuss future research directions.
King Saud University NSER

University of Waterloo's Institutional Repository

Designing Mixed Criticality Applications on Modern Heterogeneous MPSoC Platforms

Author: Caccamo Marco
Gracioli Giovani
Mancuso Renato
Mirosanlou Reza
Pellizzoni Rodolfo
Tabish Rohan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)
Publication date: 01/01/2019

Multiprocessor Systems-on-Chip (MPSoC) integrating hard processing cores with programmable logic (PL) are becoming increasingly common. While these platforms were originally designed for high-performance computing applications, their rich feature set can be exploited to efficiently implement mixed criticality domains serving both critical hard real-time tasks and soft real-time tasks. In this paper, we take a deep look at commercially available heterogeneous MPSoCs that incorporate PL and a multicore processor. We show how one can tailor these processors to support a mixed criticality system, where cores are strictly isolated to avoid contention on shared resources such as Last-Level Cache (LLC) and main memory. In order to avoid conflicts in the last-level cache, we propose the use of cache coloring, implemented in the Jailhouse hypervisor. In addition, we employ ScratchPad Memory (SPM) inside the PL to support a multi-phase execution model for real-time tasks that avoids conflicts in shared memory. We provide a full-stack, working implementation on a latest-generation MPSoC platform, and show results based on both a set of data-intensive tasks and a case study based on an image processing benchmark application.
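
The cache coloring mentioned in the abstract above rests on a simple address calculation, sketched here under stated assumptions. The sketch assumes a physically indexed, set-associative cache whose way size is a multiple of the page size; the 1 MiB, 16-way cache and 4 KiB pages are illustrative parameters, and the function names are hypothetical rather than taken from Jailhouse or the paper.

```python
def num_colors(cache_size, ways, page_size):
    """Number of page colors: how many pages fit in one cache way."""
    way_size = cache_size // ways
    if way_size == 0 or way_size % page_size != 0:
        raise ValueError("way size must be a positive multiple of the page size")
    return way_size // page_size


def page_color(phys_addr, page_size, colors):
    """Color of the physical page containing phys_addr.

    Pages with different colors map to disjoint groups of cache sets,
    so they cannot evict each other's lines.
    """
    return (phys_addr // page_size) % colors


def frames_of_color(color, first_frame, frame_count, colors):
    """Physical frame numbers in [first_frame, first_frame + frame_count)
    that carry the given color; an allocator restricted to these frames
    keeps its owner inside one cache partition."""
    return [f for f in range(first_frame, first_frame + frame_count)
            if f % colors == color]


if __name__ == "__main__":
    colors = num_colors(cache_size=1 << 20, ways=16, page_size=4096)
    print(colors, page_color(0x12345000, 4096, colors))
    print(frames_of_color(3, first_frame=0, frame_count=64, colors=colors))
```

Giving each isolated domain its own set of colors partitions the cache without hardware support, at the cost of restricting which physical pages each domain may use.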

Boston University Institutional Repository (OpenBU)

Dagstuhl Research Online Publication Server

ELLUS: projeto e implementação de um mecanismo de reconfiguraçao dinâmica de software para sistemas embarcados

Author: Gracioli Giovani
Publication date: 24/10/2012

Master's dissertation - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2009.
Dynamic software reconfiguration in conventional computing environments is the process of updating the software of a running system. This activity is extremely important for fixing errors, adding and/or removing functionality, and adapting to changes the system may undergo during its lifetime. Dynamic software reconfiguration in deeply embedded systems is an even greater challenge because of the characteristics of such systems, which have severe limitations in processing power, memory and, when battery-powered, energy. In this scenario, the software reconfiguration mechanism itself must use as few resources as possible, since it competes for the system's resources and must not affect its services. This dissertation presents the Epos Live Update System (ELUS), an operating system infrastructure that enables dynamic software reconfiguration in deeply embedded systems. Through the use of sophisticated static metaprogramming techniques in C++, ELUS uses little memory, and the reconfiguration process becomes simple and fully transparent to applications. ELUS is built within the EPOS component framework, around the remote invocation aspect, allowing reconfigurable components to be selected at compile time; for all components not selected, no memory or processing overhead is added to the system. The main characteristics that distinguish ELUS from other existing operating system infrastructures for dynamic software reconfiguration are configurability, low memory consumption, simplicity, and transparency to applications.

Repositório Institucional da UFSC

Evaluating memory subsystem of configurable heterogeneous MPSoC

Author: Bansal Ayoosh
Caccamo Marco
Gracioli Giovani
Mancuso Renato
Pellizzoni Rodolfo
Tabish Rohan
Publication date: 01/05/2018

This paper presents an evaluation of the memory subsystem of the Xilinx Ultrascale+ MPSoC. The characteristics of the various memories in the system are evaluated using carefully instrumented micro-benchmarks. The impact of micro-architectural features such as caches, prefetchers, and cache coherency is measured and discussed. The impact of multi-core contention on shared memory resources is also evaluated. Finally, proposals are made for the design of mixed-criticality real-time applications on this platform.
Accepted manuscript

Boston University Institutional Repository (OpenBU)

A Survey on Cache Management Mechanisms for Real-Time Embedded Systems

Author: Ahmed Alhammad
Antônio Augusto Fröhlich
Arnaud Alexis
Bastoni Andrea
Boyd-Wickizer Silas
Campoy Marti
Cesati M.
Chiou Derek
Cousot P.
Craciunas Silviu S.
Ding Huping
Giovani Gracioli
Gracioli G.
Grund Daniel
Hardy Damien
Herter J.
Ishikawa T.
Kirk D. B.
Kosmidis Leonidas
Lin Jiang
Mohan S.
Muralidhara S. P.
Panchamukhi S.
Puaut I.
Renato Mancuso
Rodolfo Pellizzoni
Romer Theodore
Sun Q.
Sundararajan K. T.
Tam D.
Wasly I.
Wolfe Andrew
Xiaofeng G.
Yun H.
Publication venue: Association for Computing Machinery (ACM)

Crossref

Low‐cost automotive wireless instrumentation: is it possible?

Author: Ambarish M.G.
Anderson Wedderhoff Spengler
Arif S.J.
Bajcinca N.
Briggs T.L.
Carullo A.
Civardi L.
Coulon Y.
Daniel Rossi Korol
Espadafor F.J.J.
Giovani Gracioli
Giubbolini L.
Goud V.
Hile J.W.
Ionescu B.
Jung J.
Kadhim A.H.
Lin J.R.
Lindner M.
Matsuzaki R.
Miedl F.
Nonomura Y.
Paul D.
Qin G.
Resner D.
Russell M.E.
Schmid U.
Schnabel R.
Shi L.
Sinha J.K.
Sérgio Idehara
Tachwali Y.
Uchimura Y.
Walter M.
Wang P.
Zhang N.
Öörni R.
Publication venue: Institution of Engineering and Technology (IET)

Crossref